1.
J Am Med Inform Assoc ; 30(7): 1305-1312, 2023 06 20.
Article in English | MEDLINE | ID: covidwho-2325541

ABSTRACT

Machine learning (ML)-driven computable phenotypes are among the most challenging to share and reproduce. Despite this difficulty, the urgent public health considerations around Long COVID make it especially important to ensure the rigor and reproducibility of Long COVID phenotyping algorithms such that they can be made available to a broad audience of researchers. As part of the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative, researchers with the National COVID Cohort Collaborative (N3C) devised and trained an ML-based phenotype to identify patients highly probable to have Long COVID. Supported by RECOVER, N3C and NIH's All of Us study partnered to reproduce the output of N3C's trained model in the All of Us data enclave, demonstrating model extensibility in multiple environments. This case study in ML-based phenotype reuse illustrates how open-source software best practices and cross-site collaboration can de-black-box phenotyping algorithms, prevent unnecessary rework, and promote open science in informatics.


Subject(s)
Boxing , COVID-19 , Population Health , Humans , Electronic Health Records , Post-Acute COVID-19 Syndrome , Reproducibility of Results , Machine Learning , Phenotype
2.
Clin Transl Sci ; 16(3): 489-501, 2023 03.
Article in English | MEDLINE | ID: covidwho-2269278

ABSTRACT

Sepsis accounts for one in three hospital deaths. Higher concentrations of high-density lipoprotein cholesterol (HDL-C) are associated with apparent protection from sepsis, suggesting a potential therapeutic role for HDL-C or drugs, such as cholesteryl ester transfer protein (CETP) inhibitors, that increase HDL-C. However, these beneficial clinical associations might be due to confounding; genetic approaches can address this possibility. We identified 73,406 White adults admitted to Vanderbilt University Medical Center with infection; 11,612 had HDL-C levels, and 12,377 had genotype information from which we constructed polygenic risk scores (PRS) for HDL-C and the effect of CETP on HDL-C. We tested the associations between predictors (measured HDL-C, HDL-C PRS, CETP PRS, and rs1800777) and outcomes: sepsis, septic shock, respiratory failure, and in-hospital death. In unadjusted analyses, lower measured HDL-C concentrations were significantly associated with increased risk of sepsis (p = 2.4 × 10^-23), septic shock (p = 4.1 × 10^-12), respiratory failure (p = 2.8 × 10^-8), and in-hospital death (p = 1.0 × 10^-8). After adjustment (age, sex, electronic health record length, comorbidity score, LDL-C, triglycerides, and body mass index), these associations were markedly attenuated: sepsis (p = 2.6 × 10^-3), septic shock (p = 8.1 × 10^-3), respiratory failure (p = 0.11), and in-hospital death (p = 4.5 × 10^-3). HDL-C PRS, CETP PRS, and rs1800777 significantly predicted HDL-C (p < 2 × 10^-16), but none were associated with sepsis outcomes. Concordant findings were observed in 13,254 Black patients hospitalized with infections. Lower measured HDL-C levels were significantly associated with increased risk of sepsis and related outcomes in patients with infection, but a causal relationship is unlikely because no association was found between the HDL-C PRS or the CETP PRS and the risk of adverse sepsis outcomes.
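
As a rough illustration of the polygenic risk score idea mentioned in this abstract, the sketch below computes a score as a weighted sum of effect-allele dosages. The variant names, weights, and genotype are hypothetical placeholders, and the abstract does not describe how the study actually constructed its scores.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, or 2 copies per variant)."""
    # Variants without a recorded dosage are treated as 0 copies (an assumption).
    return sum(weights[v] * dosages.get(v, 0) for v in weights)


# Hypothetical per-variant effect sizes on HDL-C (illustrative values only).
weights = {"variant_a": 0.5, "variant_b": -0.25, "variant_c": 1.0}

# Hypothetical genotype for a single patient.
patient = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

score = polygenic_risk_score(patient, weights)
print(score)
```

A score built this way can then be tested as a predictor of an outcome in place of the measured trait, which is the general logic the abstract relies on to probe confounding.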


Subject(s)
Sepsis , Shock, Septic , Adult , Humans , Cholesterol, HDL/genetics , Cholesterol, HDL/metabolism , Cholesterol Ester Transfer Proteins/genetics , Cholesterol Ester Transfer Proteins/metabolism , Hospital Mortality , Cholesterol, LDL/metabolism , Sepsis/genetics
3.
Environ Adv ; 11: 100352, 2023 Apr.
Article in English | MEDLINE | ID: covidwho-2237542

ABSTRACT

Post-acute sequelae of SARS-CoV-2 infection (PASC) affects a wide range of organ systems among a large proportion of patients with SARS-CoV-2 infection. Although studies have identified a broad set of patient-level risk factors for PASC, little is known about the association between the "exposome" (the totality of environmental exposures) and the risk of PASC. Using electronic health data of patients with COVID-19 from two large clinical research networks in New York City and Florida, we identified environmental risk factors for 23 PASC symptoms and conditions from nearly 200 exposome factors. The three domains of the exposome are the natural environment, built environment, and social environment. We conducted a two-phase environment-wide association study. In Phase 1, we ran a mixed effects logistic regression with 5-digit ZIP Code tabulation area (ZCTA5) random intercepts for each PASC outcome and each exposome factor, adjusting for a comprehensive set of patient-level confounders. In Phase 2, we ran a mixed effects logistic regression for each PASC outcome including all significant (false positive discovery adjusted p-value < 0.05) exposome characteristics identified from Phase 1 and adjusting for confounders. We identified air toxicants (e.g., methyl methacrylate), particulate matter (PM2.5) compositions (e.g., ammonium), neighborhood deprivation, and built environment (e.g., food access) that were associated with increased risk of PASC conditions related to nervous, blood, circulatory, endocrine, and other organ systems. Specific environmental risk factors for each PASC condition and symptom were different across the New York City area and Florida. Future research is warranted to extend the analyses to other regions and examine more granular exposome characteristics to inform public health efforts to help patients recover from SARS-CoV-2 infection.
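
The step between the two phases (keeping only exposome factors whose adjusted p-value is below 0.05) can be sketched as follows. The abstract does not name the adjustment procedure, so the Benjamini-Hochberg method used here is an assumption, and the factor names and p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical Phase 1 p-values for one outcome (not from the study).
phase1 = {"factor_a": 0.001, "factor_b": 0.02, "factor_c": 0.04, "factor_d": 0.5}

names = list(phase1)
adjusted = benjamini_hochberg([phase1[n] for n in names])

# Factors carried forward into the joint Phase 2 model.
selected = [n for n, q in zip(names, adjusted) if q < 0.05]
print(selected)
```

Note that factor_c has a raw p-value below 0.05 but is dropped after adjustment, which is the behavior a multiple-testing correction is meant to produce.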

4.
J Am Med Inform Assoc ; 2022 Aug 25.
Article in English | MEDLINE | ID: covidwho-2231778

ABSTRACT

OBJECTIVE: COVID-19 survivors are at risk for long-term health effects, but assessing the sequelae of COVID-19 at large scales is challenging. High-throughput methods to efficiently identify new medical problems arising after acute medical events using the electronic health record (EHR) could improve surveillance for long-term consequences of acute medical problems like COVID-19. MATERIALS AND METHODS: We augmented an existing high-throughput phenotyping method (PheWAS) to identify new diagnoses occurring after an acute temporal event in the EHR. We then used the temporal-informed phenotypes to assess development of new medical problems among COVID-19 survivors enrolled in an EHR cohort of adults tested for COVID-19 at Vanderbilt University Medical Center. RESULTS: The study cohort included 186,105 adults tested for COVID-19 from March 5, 2020 to November 1, 2021, of whom 30,088 (16.2%) tested positive. Median follow-up after testing was 412 days (IQR 274-528). Our temporal-informed phenotyping was able to distinguish phenotype chapters based on chronicity of their constituent diagnoses. PheWAS with temporal-informed phenotypes identified increased risk for 43 diagnoses among COVID-19 survivors during outpatient follow-up, including multiple new respiratory, cardiovascular, neurological, and pregnancy-related conditions. Findings were robust to sensitivity analyses, and several phenotypic associations were supported by changes in outpatient vital signs or laboratory tests from the pre-testing to post-recovery period. CONCLUSION: Temporal-informed PheWAS identified new diagnoses affecting multiple organ systems among COVID-19 survivors. These findings can inform future efforts to enable longitudinal health surveillance for survivors of COVID-19 and other acute medical conditions using the EHR.
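
One simple way to read "new diagnoses occurring after an acute temporal event" is: a diagnosis recorded after the index date that never appears on or before it. The sketch below implements only that reading. The rule, the diagnosis labels, and the dates are assumptions for illustration; the abstract does not specify the study's actual definition.

```python
from datetime import date


def new_diagnoses_after(records, index_date):
    """Return diagnoses recorded after index_date and never on or before it."""
    before = {code for code, when in records if when <= index_date}
    after = {code for code, when in records if when > index_date}
    return sorted(after - before)


# Hypothetical diagnosis history for one patient: (label, date recorded).
records = [
    ("hypertension", date(2019, 5, 1)),   # pre-existing
    ("hypertension", date(2020, 9, 1)),   # recurs after the event, so not new
    ("dyspnea", date(2020, 10, 15)),      # first seen after the event
    ("fatigue", date(2021, 1, 20)),       # first seen after the event
]

# Hypothetical index event, e.g. the date of a COVID-19 test.
index_date = date(2020, 6, 1)

new_codes = new_diagnoses_after(records, index_date)
print(new_codes)
```

Excluding diagnoses already present before the index date is what separates incident conditions from chronic ones that simply continue to be billed.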

5.
J Biomed Inform ; 117: 103777, 2021 05.
Article in English | MEDLINE | ID: covidwho-1171479

ABSTRACT

From the start of the coronavirus disease 2019 (COVID-19) pandemic, researchers have looked to electronic health record (EHR) data as a way to study possible risk factors and outcomes. To ensure the validity and accuracy of research using these data, investigators need to be confident that the phenotypes they construct are reliable and accurate, reflecting the healthcare settings from which they are ascertained. We developed a COVID-19 registry at a single academic medical center and used data from March 1 to June 5, 2020 to assess differences in population-level characteristics in pandemic and non-pandemic years respectively. Median EHR length, previously shown to impact phenotype performance in type 2 diabetes, was significantly shorter in the SARS-CoV-2 positive group relative to a 2019 influenza tested group (median 3.1 years vs 8.7; Wilcoxon rank sum P = 1.3e-52). Using three phenotyping methods of increasing complexity (billing codes alone and domain-specific algorithms provided by an EHR vendor and clinical experts), common medical comorbidities were abstracted from COVID-19 EHRs, defined by the presence of a positive laboratory test (positive predictive value 100%, recall 93%). After combining performance data across phenotyping methods, we observed significantly lower false negative rates for those records billed for a comprehensive care visit (p = 4e-11) and those with complete demographics data recorded (p = 7e-5). In an early COVID-19 cohort, we found that phenotyping performance of nine common comorbidities was influenced by median EHR length, consistent with previous studies, as well as by data density, which can be measured using portable metrics including CPT codes. Here we present those challenges and potential solutions to creating deeply phenotyped, acute COVID-19 cohorts.
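
For readers less familiar with the metrics quoted above, positive predictive value and recall follow directly from counts of true positives, false positives, and false negatives. The counts in this sketch are hypothetical, chosen only so the output matches the percentages reported in the abstract; they are not the study's data.

```python
def ppv_and_recall(tp, fp, fn):
    """Positive predictive value and recall from confusion-matrix counts."""
    ppv = tp / (tp + fp)       # share of flagged records that are true cases
    recall = tp / (tp + fn)    # share of true cases that were flagged
    return ppv, recall


# Hypothetical counts (not from the study).
ppv, recall = ppv_and_recall(tp=93, fp=0, fn=7)
print(f"PPV = {ppv:.0%}, recall = {recall:.0%}")
```

The false negative rate discussed later in the abstract is the complement of recall, that is, fn / (tp + fn).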


Subject(s)
COVID-19/diagnosis , Electronic Health Records , Phenotype , Comorbidity , Diabetes Mellitus, Type 2 , Global Health , Humans , Influenza, Human , Likelihood Functions , Pandemics
6.
J Biomed Inform ; 117: 103748, 2021 05.
Article in English | MEDLINE | ID: covidwho-1152466

ABSTRACT

OBJECTIVE: Identifying symptoms and characteristics highly specific to coronavirus disease 2019 (COVID-19) would improve the clinical and public health response to this pandemic challenge. Here, we describe a high-throughput approach - Concept-Wide Association Study (ConceptWAS) - that systematically scans a disease's clinical manifestations from clinical notes. We used this method to identify symptoms specific to COVID-19 early in the course of the pandemic. METHODS: We created a natural language processing pipeline to extract concepts from clinical notes in a local ER corresponding to the PCR testing date for patients who had a COVID-19 test and evaluated these concepts as predictors for developing COVID-19. We identified predictors from Firth's logistic regression adjusted by age, gender, and race. We also performed ConceptWAS using cumulative data every two weeks to identify the timeline for recognition of early COVID-19-specific symptoms. RESULTS: We processed 87,753 notes from 19,692 patients subjected to COVID-19 PCR testing between March 8, 2020, and May 27, 2020 (1,483 COVID-19-positive). We found 68 concepts significantly associated with a positive COVID-19 test. We identified symptoms associated with increasing risk of COVID-19, including "anosmia" (odds ratio [OR] = 4.97, 95% confidence interval [CI] = 3.21-7.50), "fever" (OR = 1.43, 95% CI = 1.28-1.59), "cough with fever" (OR = 2.29, 95% CI = 1.75-2.96), and "ageusia" (OR = 5.18, 95% CI = 3.02-8.58). Using ConceptWAS, we were able to detect loss of smell and loss of taste three weeks prior to their inclusion as symptoms of the disease by the Centers for Disease Control and Prevention (CDC). CONCLUSION: ConceptWAS, a high-throughput approach for exploring specific symptoms and characteristics of a disease like COVID-19, shows promise for EHR-powered early identification of disease manifestations.
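
To make the odds ratios and confidence intervals above easier to interpret, the sketch below computes an unadjusted odds ratio with a Wald 95% interval from a 2x2 table. This is a deliberately simpler technique than the covariate-adjusted Firth's logistic regression named in the abstract, and the counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: concept present, test positive   b: concept present, test negative
    c: concept absent,  test positive   d: concept absent,  test negative
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts (not from the study).
odds_ratio, lower, upper = odds_ratio_wald_ci(a=20, b=80, c=10, d=190)
print(f"OR = {odds_ratio:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

An interval that excludes 1 corresponds to an association that is significant at roughly the 5% level; Firth's penalized approach is generally preferred when some cells are small, where the Wald approximation becomes unreliable.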


Subject(s)
COVID-19/diagnosis , Natural Language Processing , Symptom Assessment/methods , Adult , Ageusia , COVID-19 Nucleic Acid Testing , Cough , Female , Fever , Humans , Male , Middle Aged , Pandemics , United States
7.
J Biomed Inform ; 113: 103657, 2021 01.
Article in English | MEDLINE | ID: covidwho-970257

ABSTRACT

OBJECTIVE: During the COVID-19 pandemic, health systems postponed non-essential medical procedures to accommodate a surge of critically ill patients. The long-term consequences of delaying procedures in response to COVID-19 remain unknown. We developed a high-throughput approach to understand the impact of delaying procedures on patient health outcomes using electronic health record (EHR) data. MATERIALS AND METHODS: We used EHR data from Vanderbilt University Medical Center's (VUMC) Research and Synthetic Derivatives. Elective procedures and non-urgent visits were suspended at VUMC between March 18, 2020 and April 24, 2020. Surgical procedure data from this period were compared to a similar timeframe in 2019. Potential adverse impact of delay in cardiovascular and cancer-related procedures was evaluated using EHR data collected from January 1, 1993 to March 17, 2020. For surgical procedure delay, outcomes included length of hospitalization (days), mortality during hospitalization, and readmission within six months. For screening procedure delay, outcomes included 5-year survival and cancer stage at diagnosis. RESULTS: We identified 416 surgical procedures that were negatively impacted during the COVID-19 pandemic compared to the same timeframe in 2019. Using retrospective data, we found 27 significant associations between procedure delay and adverse patient outcomes. Clinician review indicated that 88.9% of the significant associations were plausible and potentially clinically significant. Analytic pipelines for this study are available online. CONCLUSION: Our approach enables health systems to identify medical procedures affected by the COVID-19 pandemic and evaluate the effect of delay, enabling them to communicate effectively with patients and prioritize rescheduling to minimize adverse patient outcomes.


Subject(s)
COVID-19/epidemiology , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/surgery , Neoplasms/diagnosis , Neoplasms/surgery , Pandemics , Time-to-Treatment , Adult , COVID-19/virology , Female , Humans , Male , Middle Aged , Retrospective Studies , SARS-CoV-2/isolation & purification